Content Extraction from Marketing Flyers
Authors
Abstract
The rise of online shopping has hurt physical retailers, which struggle to persuade customers to buy products in stores rather than online. Marketing flyers are a great means of increasing the visibility of physical retailers, but the unstructured offers in these documents cannot easily be compared with similar online deals, making it hard for customers to tell whether it is more advantageous to order a product online or to buy it in the physical shop. In this work we tackle this problem by introducing a content extraction algorithm that automatically extracts structured data from flyers. Unlike competing approaches, which mainly focus on textual content or simply analyze font type, color, and text positioning, we propose novel, more advanced visual features that capture the properties of the graphic elements typically used in marketing materials to draw readers' attention to specific deals. The approach obtains excellent results and is highly independent of language and genre.
Similar resources
Combining Visual and Textual Features for Information Extraction from Online Flyers
Information in visually rich formats such as PDF and HTML is often conveyed by a combination of textual and visual features. In particular, genres such as marketing flyers and info-graphics often augment textual information by its color, size, positioning, etc. As a result, traditional text-based approaches to information extraction (IE) could underperform. In this study, we present a supervise...
SciFly - Customised Flyers on Demand
Methods: SciFly is a demonstration system that generates customised flyers about CSIRO’s research in Information and Communication Technologies, based on user-selected areas of interest. It has been designed to operate as a touch-screen information kiosk, but also has a web-interface to allow remote, browser-based interaction. In response to a user nominating their areas of interest, SciFly dyn...
Digital Leafleting: Extracting Structured Data from Multimedia Online Flyers
Marketing materials such as flyers and other infographics are a vast online resource. In a number of industries, such as the commercial real estate industry, they are in fact the only authoritative source of information. Companies attempting to organize commercial real estate inventories spend a significant amount of resources on manual data entry of this information. In this work, we propose a...
An interactive media program for managing psychosocial problems on long-duration spaceflights.
Space crews must be self-reliant to complete long-duration missions successfully. This project involves the development and evaluation of a network of self-guided interactive multimedia programs to train and assist long-duration flyers in the prevention, assessment, and management of psychosocial problems that can arise on extended missions. The system is currently under development and is inte...
Total Phenolic Content, Total Flavonoid Content and Radical Scavenging Activity from Zingiber zerumbet Rhizome using Subcritical Water Extraction
Subcritical water extraction (SWE) is an alternative technique that uses water as a solvent. The objective of this work was to compare the efficiency of SWE over a temperature range of 100ºC to 180ºC and extraction times of 5 to 25 minutes with that of an ethanolic Soxhlet extraction in terms of total phenolic content (TPC), total flavonoid content (TFC) and radical scavenging activity (RSA...